Claims:

1. A method of training a scoring matrix for use by a classification system, the
classification system for use in performing classification requests based on natural
language text and with use of said scoring matrix which has been based on a set of
training data comprising natural language text, the method comprising the steps of:

generating an initial scoring matrix comprising a numerical value for each of a
set of n classes in association with each of a set of m features, the initial scoring matrix
based on said set of training data and, for each element of said set of training data,
based on a subset of said features which are comprised in said natural language text of
said element of said set of training data and on one of said classes which has been
identified therefor; and

based on the initial scoring matrix and said set of training data, generating a
discriminatively trained scoring matrix for use by said classification system by
adjusting one or more of said numerical values such that a greater degree of
discrimination exists between competing ones of said classes when said classification
requests are performed, thereby resulting in a reduced classification error rate.
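The first step of claim 1 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the numerical values are per-class feature counts normalized to unit length and that features are whitespace-separated words; the claim itself does not fix either choice, and the name `build_initial_matrix` is hypothetical.

```python
def build_initial_matrix(training_data, classes, features):
    """Return an n x m matrix (one row per class, one column per feature).

    training_data is a list of (text, identified_class) pairs.
    Assumption: values are feature counts, normalized per class row.
    """
    feature_index = {f: j for j, f in enumerate(features)}
    class_index = {c: i for i, c in enumerate(classes)}
    matrix = [[0.0] * len(features) for _ in classes]

    for text, label in training_data:
        row = matrix[class_index[label]]
        for token in text.lower().split():
            j = feature_index.get(token)
            if j is not None:          # only features present in the text count
                row[j] += 1.0

    # Normalize each class row to unit length so that classes with more
    # training text do not dominate the scores.
    for row in matrix:
        norm = sum(v * v for v in row) ** 0.5
        if norm > 0:
            for j in range(len(row)):
                row[j] /= norm
    return matrix
```

The resulting matrix is only a starting point; the second step of the claim then adjusts its values discriminatively.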



2. The method of claim 1 wherein said step of adjusting said numerical values is
performed with use of a Generalized Probabilistic Descent algorithm.

3. The method of claim 2 wherein said step of adjusting said numerical values
comprises iteratively adjusting said numerical values until a stopping criterion is met.

4. The method of claim 3 wherein said stopping criterion comprises an empirical
loss threshold.

5. The method of claim 3 wherein said stopping criterion comprises a classification
error rate threshold.

6. The method of claim 2 wherein said step of adjusting said numerical values
comprises, for each element of said set of training data, modifying values associated 
with the identified class and values associated with one or more of the other classes 
such that a score obtained for said element of said set of training data based on said 
modified values associated with the identified class is improved relative to one or more 
scores obtained for said element of said set of training data based on said modified 
values associated with said one or more other classes. 
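The adjustment described in claim 6 can be sketched as a single gradient-style step. This is a simplified illustration, not the equation of the claims: it assumes a dot-product score, a sigmoid loss on the score difference, and an update against only the single strongest competing class. The names `score`, `update_step`, `step`, and `gamma` are hypothetical.

```python
import math


def score(matrix, x):
    """Dot-product score of feature vector x against every class row."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in matrix]


def update_step(matrix, x, k, step=0.1, gamma=1.0):
    """Adjust matrix in place for one training vector x whose identified class is k.

    Assumes at least two classes. Returns the smoothed loss for this element.
    """
    scores = score(matrix, x)
    # Strongest competing class (assumption: only one competitor is updated).
    rival = max((v for v in range(len(matrix)) if v != k), key=lambda v: scores[v])
    d = scores[rival] - scores[k]              # > 0 means the element is misclassified
    z = max(-60.0, min(60.0, gamma * d))       # clamp to keep exp() in range
    loss = 1.0 / (1.0 + math.exp(-z))          # smooth stand-in for the 0/1 error
    slope = gamma * loss * (1.0 - loss)        # derivative of the loss with respect to d
    for w, xw in enumerate(x):
        matrix[k][w] += step * slope * xw      # raise the identified class
        matrix[rival][w] -= step * slope * xw  # lower the competing class
    return loss
```

After the step, the score of the identified class improves relative to the competitor for that element, which is the behavior the claim describes.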

7. The method of claim 6 wherein said values are modified based on the following
equation of the form:

\[
r(t+1) =
\begin{cases}
\ldots & \text{if } v = k \\
\ldots & \text{if } v \neq k
\end{cases}
\]

8. The method of claim 1 wherein said classification requests comprise call routing 
requests, wherein said classes comprise call routing destinations, and wherein said set of 
training data comprises call routing requests together with associated identified call 
routing destinations therefor. 



9. The method of claim 1 wherein said classification requests comprise document 
retrieval requests, wherein said classes comprise documents, and wherein said set of 
training data comprises document retrieval requests together with associated identified 
documents therefor.

10. The method of claim 1 wherein said natural language text upon which said
classification requests are based and said natural language text comprised in said set of 
training data is processed without the use of stop word filtering. 
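Processing text without stop word filtering, as in claim 10, means that function words such as "the" or "to" remain available as features. A minimal sketch follows; the use of unigrams and bigrams, the regular expression, and the name `extract_features` are assumptions made for illustration and are not taken from the claims.

```python
import re


def extract_features(text, max_n=2):
    """Return word n-gram features (n = 1..max_n) with no stop word list applied."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            feats.append(" ".join(words[i:i + n]))
    return feats
```

For example, `extract_features("I want to pay the bill")` keeps "to", "the", and "to pay" rather than discarding them before training or classification.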



11. A method of performing classification requests based on natural language text
and with use of a discriminatively trained scoring matrix which has been trained based 
on a set of training data comprising natural language text, the scoring matrix having 
been discriminatively trained by a method comprising the steps of: 

generating an initial scoring matrix comprising a numerical value for each of a 
set of n classes in association with each of a set of m features, the initial scoring matrix 
based on said set of training data and, for each element of said set of training data, 
based on a subset of said features which are comprised in said natural language text of 
said element of said set of training data and on one of said classes which has been 
identified therefor; and 

based on the initial scoring matrix and said set of training data, generating said 
discriminatively trained scoring matrix by adjusting one or more of said numerical 
values such that a greater degree of discrimination exists between competing ones of 
said classes when said classification requests are performed, thereby resulting in a 
reduced classification error rate. 
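Performing a classification request with a trained scoring matrix, as recited in claim 11, can be sketched as scoring the request against every class and choosing the highest score. The dot-product score, the whitespace tokenization, and the names `classify` and `feature_index` are assumptions for illustration only.

```python
def classify(matrix, classes, feature_index, text):
    """Return (best_class, scores) for a natural language request.

    matrix: one row of numerical values per class, one column per feature.
    feature_index: maps a feature string to its column number.
    """
    # Build the feature vector for the request (assumption: word counts).
    x = [0.0] * len(feature_index)
    for token in text.lower().split():
        j = feature_index.get(token)
        if j is not None:
            x[j] += 1.0

    # Score every class and pick the highest-scoring one.
    scores = [sum(r * xi for r, xi in zip(row, x)) for row in matrix]
    best = max(range(len(classes)), key=lambda i: scores[i])
    return classes[best], scores
```

In a call routing setting, `classes` would hold the routing destinations and the returned class would be the destination chosen for the request.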

12. The method of claim 11 wherein said step of adjusting said numerical values is
performed with use of a Generalized Probabilistic Descent algorithm. 

13. The method of claim 12 wherein said step of adjusting said numerical values 
comprises iteratively adjusting said numerical values until a stopping criterion is met. 

14. The method of claim 13 wherein said stopping criterion comprises an empirical 
loss threshold.

15. The method of claim 13 wherein said stopping criterion comprises a
classification error rate threshold. 
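The iterative adjustment with a stopping criterion (claims 13 to 15) can be sketched as a loop that stops once either the mean loss or the error rate over the training data falls to a threshold. This is a simplified illustration rather than the claimed algorithm: the sigmoid loss, the single-competitor update, and the names `train`, `loss_threshold`, `error_threshold`, and `max_epochs` are all assumptions.

```python
import math


def train(matrix, samples, step=0.5, loss_threshold=0.3,
          error_threshold=0.0, max_epochs=200):
    """Iteratively adjust matrix in place; return (epochs, mean_loss, error_rate).

    samples: non-empty list of (feature_vector, identified_class_index).
    Assumes at least two classes.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    mean_loss, error_rate = 1.0, 1.0
    for epoch in range(1, max_epochs + 1):
        total_loss, errors = 0.0, 0
        for x, k in samples:
            scores = [sum(r * xi for r, xi in zip(row, x)) for row in matrix]
            rival = max((v for v in range(len(matrix)) if v != k),
                        key=lambda v: scores[v])
            d = scores[rival] - scores[k]       # >= 0 counts as an error
            errors += 1 if d >= 0 else 0
            z = max(-60.0, min(60.0, d))        # clamp to keep exp() in range
            loss = 1.0 / (1.0 + math.exp(-z))   # smooth stand-in for the 0/1 error
            total_loss += loss
            slope = loss * (1.0 - loss)
            for w, xw in enumerate(x):
                matrix[k][w] += step * slope * xw
                matrix[rival][w] -= step * slope * xw
        mean_loss = total_loss / len(samples)   # empirical loss over this pass
        error_rate = errors / len(samples)      # classification error rate over this pass
        # Stopping criterion: empirical loss threshold or error rate threshold.
        if mean_loss <= loss_threshold or error_rate <= error_threshold:
            return epoch, mean_loss, error_rate
    return max_epochs, mean_loss, error_rate
```

Both quantities are measured during the pass, before each element's own update is applied, which keeps the sketch to a single loop over the data per iteration.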

16. The method of claim 12 wherein said step of adjusting said numerical values 
comprises, for each element of said set of training data, modifying values associated 
with the identified class and values associated with one or more of the other classes 
such that a score obtained for said element of said set of training data based on said 
modified values associated with the identified class is improved relative to one or more 
scores obtained for said element of said set of training data based on said modified 
values associated with said one or more other classes. 

17. The method of claim 16 wherein said values are modified based on the
following equation of the form:

\[
r(t+1) =
\begin{cases}
\ldots & \text{if } v = k \\
\ldots & \text{if } v \neq k
\end{cases}
\]

18. The method of claim 11 wherein said classification requests comprise call
routing requests, wherein said classes comprise call routing destinations, and wherein 
said set of training data comprises call routing requests together with associated 
identified call routing destinations therefor. 

19. The method of claim 11 wherein said classification requests comprise document
retrieval requests, wherein said classes comprise documents, and wherein said set of 
training data comprises document retrieval requests together with associated identified 
documents therefor.

20. The method of claim 11 wherein said natural language text upon which said
classification requests are based and said natural language text comprised in said set of 
training data is processed without the use of stop word filtering. 



21. An apparatus for training a scoring matrix for use by a classification system, the
classification system for use in performing classification requests based on natural 
language text and with use of said scoring matrix which has been based on a set of 
training data comprising natural language text, the apparatus comprising: 

means for generating an initial scoring matrix comprising a numerical value for 
each of a set of n classes in association with each of a set of m features, the initial 
scoring matrix based on said set of training data and, for each element of said set of 
training data, based on a subset of said features which are comprised in said natural 
language text of said element of said set of training data and on one of said classes 
which has been identified therefor; and 

based on the initial scoring matrix and said set of training data, means for 
generating a discriminatively trained scoring matrix for use by said classification 
system by adjusting one or more of said numerical values such that a greater degree of 
discrimination exists between competing ones of said classes when said classification 
requests are performed, thereby resulting in a reduced classification error rate. 

22. The apparatus of claim 21 wherein said means for adjusting said numerical 
values executes a Generalized Probabilistic Descent algorithm. 

23. The apparatus of claim 22 wherein said means for adjusting said numerical 
values comprises means for iteratively adjusting said numerical values until a stopping 
criterion is met. 

24. The apparatus of claim 23 wherein said stopping criterion comprises an 
empirical loss threshold.

25. The apparatus of claim 23 wherein said stopping criterion comprises a
classification error rate threshold.

26. The apparatus of claim 22 wherein said means for adjusting said numerical
values comprises, for each element of said set of training data, means for modifying
values associated with the identified class and values associated with one or more of the
other classes such that a score obtained for said element of said set of training data
based on said modified values associated with the identified class is improved relative
to one or more scores obtained for said element of said set of training data based on said
modified values associated with said one or more other classes.



27. The apparatus of claim 26 wherein said values are modified based on the
following equation of the form:

\[
r(t+1) =
\begin{cases}
\ldots & \text{if } v = k \\
\ldots & \text{if } v \neq k
\end{cases}
\]

28. The apparatus of claim 21 wherein said classification requests comprise call
routing requests, wherein said classes comprise call routing destinations, and wherein 
said set of training data comprises call routing requests together with associated 
identified call routing destinations therefor. 



29. The apparatus of claim 21 wherein said classification requests comprise
document retrieval requests, wherein said classes comprise documents, and wherein 
said set of training data comprises document retrieval requests together with associated 
identified documents therefor.

30. The apparatus of claim 21 wherein said natural language text upon which said
classification requests are based and said natural language text comprised in said set of 
training data is processed without the use of stop word filtering. 

31. An apparatus for performing classification requests based on natural language
text and with use of a discriminatively trained scoring matrix which has been trained 
based on a set of training data comprising natural language text, the scoring matrix 
having been discriminatively trained by an apparatus comprising: 

means for generating an initial scoring matrix comprising a numerical value for 
each of a set of n classes in association with each of a set of m features, the initial 
scoring matrix based on said set of training data and, for each element of said set of 
training data, based on a subset of said features which are comprised in said natural 
language text of said element of said set of training data and on one of said classes 
which has been identified therefor; and 

based on the initial scoring matrix and said set of training data, means for 
generating said discriminatively trained scoring matrix by adjusting one or more of said 
numerical values such that a greater degree of discrimination exists between competing 
ones of said classes when said classification requests are performed, thereby resulting in 
a reduced classification error rate. 




32. The apparatus of claim 31 wherein said means for adjusting said numerical 
values executes a Generalized Probabilistic Descent algorithm. 

33. The apparatus of claim 32 wherein said means for adjusting said numerical 
values comprises means for iteratively adjusting said numerical values until a stopping 
criterion is met. 

34. The apparatus of claim 33 wherein said stopping criterion comprises an 
empirical loss threshold.

35. The apparatus of claim 33 wherein said stopping criterion comprises a
classification error rate threshold. 

36. The apparatus of claim 32 wherein said means for adjusting said numerical 
values comprises, for each element of said set of training data, means for modifying 
values associated with the identified class and values associated with one or more of the 
other classes such that a score obtained for said element of said set of training data 
based on said modified values associated with the identified class is improved relative 
to one or more scores obtained for said element of said set of training data based on said 
modified values associated with said one or more other classes. 

37. The apparatus of claim 36 wherein said values are modified based on the
following equation of the form:

\[
r(t+1) =
\begin{cases}
\ldots & \text{if } v = k \\
\ldots & \text{if } v \neq k
\end{cases}
\]

38. The apparatus of claim 31 wherein said classification requests comprise call 
routing requests, wherein said classes comprise call routing destinations, and wherein 
said set of training data comprises call routing requests together with associated 
identified call routing destinations therefor. 

39. The apparatus of claim 31 wherein said classification requests comprise 
document retrieval requests, wherein said classes comprise documents, and wherein 
said set of training data comprises document retrieval requests together with associated 
identified documents therefor.

40. The apparatus of claim 31 wherein said natural language text upon which said
classification requests are based and said natural language text comprised in said set of 
training data is processed without the use of stop word filtering. 